ABSTRACT

Identifying new indications for existing drugs (drug 
repositioning) is an efficient way of maximizing 
their potential. Adverse drug reaction (ADR) is one 
of the leading causes of death among hospitalized 
patients. As both new indications and ADRs are 
caused by unexpected chemical-protein inter- 
actions on off-targets, it is reasonable to predict 
these interactions by mining the chemical-protein 
interactome (CPI). Making such predictions has 
recently been facilitated by a web server named 
DRAR-CPI. This server has a representative collec- 
tion of drug molecules and targetable human 
proteins built up from our work in drug repositioning 
and ADR. When a user submits a molecule, the 
server will give the positive or negative association 
scores between the user's molecule and our library 
drugs based on their interaction profiles towards 
the targets. Users can thus predict the indications 
or ADRs of their molecule based on the association 
scores towards our library drugs. We have matched 
our predictions of drug-drug associations with 
those predicted via gene-expression profiles, 
achieving a matching rate as high as 74%. We 
have also successfully predicted the connections 
between anti-psychotics and anti-infectives,
indicating the underlying relevance of
anti-psychotics in the potential treatment of
infections, and vice versa. This server is freely available
at http://cpi.bio-x.cn/drar/. 

INTRODUCTION 

More than 90% of drug candidates fail during develop- 
ment (1), which makes pharmaceutical R&D extremely 
expensive and time consuming. Identifying novel indica- 
tions for existing drugs, or drug repositioning, can 
enhance drug safety, maximize the potential of the drugs 
and lower R&D costs (2,3). Many drugs such as sildenafil 
citrate (Viagra®) and raloxifene hydrochloride (Evista®) 
have already been repositioned for other indications 
after reports of side effects in clinical trials (4). Adverse 
drug reaction (ADR) has always been a world-wide 
concern as one of the leading causes of death among 
hospitalized patients (5,6). Since both new indications 
and ADRs are caused by unexpected chemical-protein 
interactions (7-14), which may be indirect or complex at 
the mechanism level, it is reasonable to try to predict these 
interactions based on mining the chemical-protein 
interactome (CPI). 

Understanding drug-drug associations can not only 
benefit the discovery of novel indications and therapies 
(15) but also prevent serious negative outcomes (16). 
The use of a large database of transcriptional responses 
to identify connections between small molecules which
share the same mechanisms and processes within 
diseases (17,18) can reveal unexpected similarities 
between drugs so as to indicate potential repositioning 
uses (19-23) or unexpected adverse reactions (24). 
High-throughput technologies such as microarrays
can generate large quantities of data for
analyzing drug-drug associations, but this methodology,
although robust, can be costly (25,26), and its reliability
and quality measures still need to be improved (27,28).

Here we introduce the DRAR-CPI server, for predict- 
ing Drug Repositioning potential and ADR via CPI (29). 
This server has a comprehensive collection of the 385 
structural models of targetable human proteins and 254 
active forms of small molecules with known descriptions, 
indications and ADRs. When a user submits a molecule, 
docking programs can be applied to calculate the binding 
energy between the uploaded molecule and the targets. 
The server will give the positive or negative association 
scores between the user's molecule and our library drugs 
based on their interaction profiles across 385 human 
proteins and will also suggest candidate off-targets that 
tend to interact with it. Since our library drugs have a 
comprehensive annotation of their indications and 
ADRs, users can predict potential indications or ADRs 
based on the association scores of their molecule across 
our library molecules. We have matched our in silico pre- 
dictions of drug-drug associations with those predicted 
via gene-expression profiles, achieving a matching rate as 
high as 74% while significantly reducing time and cost. 
Information on drug-drug associations can lead to new 
indications for existing drugs, such as the application of 
anti-psychotics in the potential treatment of bacterial 
infections, and vice versa. The server is freely available at
http://cpi.bio-x.cn/drar/. 



METHODS 

Preparation of the target set and the library drugs 

Using the criteria (29,30) and preparation method (31) 
described in our previous research, we obtained 385
pocket models of 353 proteins with known functions
derived from UniProt. We then chose 254 active forms
of 166 small molecules from DrugBank (24) with known
descriptions, indications and ADRs as our library drugs
based on the collection criteria of the background drugs 
in our previous work (30). All the proteins are human
proteins from third-party targetable protein databases,
and all drug molecules are from our previous study; we
did not add any subjectively selected protein or drug based
on our interest in drug repositioning, so that the collection
remains a representative set of the background distribution for both
proteins and drug molecules. We will continue to update
the targets and library drugs in DRAR-CPI, and users can 
subscribe to our updates through RSS feeds. 

Preparation of the library interactome 

We prepared an in silico hybridization using the DOCK 
program (32), generating a library interactome of 254 
library ligands towards 385 protein pockets in the form
of a docking score matrix of 254 × 385 elements. Docking
scores > 0 were treated as missing values according to our 
previous scoring process pipeline (31). The two-directional 
Z-transformation (2DIZ) was applied to process the 
original docking-score matrix so that the docking scores 
were normalized to the direction of drugs as Z-scores and 
then to the direction of targets as Z'-scores to increase 
accuracy (31). 
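
The normalization described in this subsection can be sketched in a few lines of Python. This is a minimal illustration and not the server's code. The function names are hypothetical, the toy numbers are made up, and two points are assumptions: that "direction of drugs" means standardizing across drugs for each target before standardizing across targets for each drug, and that the sample standard deviation is used.

```python
from statistics import mean, stdev


def standardize(values):
    """Z-transform the non-missing entries of a list; None marks a missing value."""
    present = [v for v in values if v is not None]
    if len(present) < 2:
        return [None if v is None else 0.0 for v in values]
    mu, sd = mean(present), stdev(present)  # sample SD: an assumption
    if sd == 0:
        return [None if v is None else 0.0 for v in values]
    return [None if v is None else (v - mu) / sd for v in values]


def two_directional_z(scores):
    """scores[d][t] is the docking score of drug d towards target t."""
    # Docking scores > 0 are treated as missing values.
    cleaned = [[None if s is None or s > 0 else s for s in row] for row in scores]
    n_drugs, n_targets = len(cleaned), len(cleaned[0])

    # Step 1 (assumed axis): for each target, standardize across drugs -> Z-scores.
    z = [[None] * n_targets for _ in range(n_drugs)]
    for t in range(n_targets):
        column = standardize([cleaned[d][t] for d in range(n_drugs)])
        for d in range(n_drugs):
            z[d][t] = column[d]

    # Step 2: for each drug, standardize its Z-scores across targets -> Z'-scores.
    return [standardize(row) for row in z]


# Toy 3 x 3 matrix (made-up numbers); the positive score becomes a missing value.
toy = [[-30.0, -20.0, 5.0],
       [-40.0, -25.0, -10.0],
       [-50.0, -30.0, -20.0]]
z_prime = two_directional_z(toy)
```

Missing values are carried through both passes as `None`, so a target that a drug fails to dock into does not distort the mean or standard deviation of either direction.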

Evaluation of the drug-drug associations 

When one drug is uploaded, it is 'hybridized' with all 
targets using the DOCK program (32). The docking 
scores of all the library drugs plus the uploaded drug 
towards all the targets are transformed into a matrix of 
Z'-scores containing 255 × 385 elements for the calculation
of the enrichment score. We developed an algorithm
based on connectivity analytics (17) to calculate an association
score $S^i$ and a P-value between the uploaded drug
and each library drug i. For one uploaded drug, after
2DIZ (31), we treat the targets towards the uploaded
drug with a Z'-score < -1 as the favorable targets and
those with Z'-score > 1 as unfavorable targets. Both the
favorable and unfavorable targets construct the query signature
at this stage. For each library drug, say drug i, we
compute an enrichment score for the set of favorable or
unfavorable targets in the signature, $ks^i_{up}$ and $ks^i_{down}$,
respectively. To calculate $ks^i_{up}$, we set n as the total number
of all the targets and t as the number of the favorable
targets. We sort the favorable targets by Z'-scores
towards the library drug i in ascending order and get
their positions (1 . . . t) as list T. Then we sort all the
targets in the same way and get their positions (1 . . . n)
as list N. For each favorable target, we get its position
in list T as j and its corresponding position in list N as
v, and calculate the following values:

$$a = \max_{1 \le j \le t}\left[\frac{j}{t} - \frac{v}{n}\right], \qquad b = \max_{1 \le j \le t}\left[\frac{v}{n} - \frac{j-1}{t}\right]$$

Set $ks^i_{up} = a$ if $a > b$, or $ks^i_{up} = -b$ if $b > a$. Then calculate
$ks^i_{down}$ for the unfavorable targets in the same way. Set the
association score $S^i = 0$ if $ks^i_{up}$ and $ks^i_{down}$ have the same
algebraic sign. Otherwise, set $s^i = ks^i_{up} - ks^i_{down}$. Scan
across all the library drugs and get the maximum and
minimum of $s^i$ as $s_{max}$ and $s_{min}$, respectively. Set the
association score $S^i = s^i/s_{max}$ when $s^i > 0$ or $S^i = -s^i/s_{min}$ when
$s^i < 0$. The association score $S^i$ is calculated from $ks^i_{up}$ and
$ks^i_{down}$ and the P-value is calculated using the
Kolmogorov-Smirnov statistic.
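
The scoring procedure of this subsection can be sketched as follows. This is an illustrative reading and not the server's implementation. It assumes the Kolmogorov-Smirnov-like enrichment used in connectivity analytics (17), with a = max over j of (j/t - v/n) and b = max over j of (v/n - (j-1)/t). All function names and the toy Z'-score profiles are hypothetical, the handling of the tie a = b is an assumption, and the P-value calculation is not covered.

```python
def query_signature(z_query):
    """Split targets into favorable (Z' < -1) and unfavorable (Z' > 1) sets."""
    favorable = {t for t, z in z_query.items() if z < -1}
    unfavorable = {t for t, z in z_query.items() if z > 1}
    return favorable, unfavorable


def enrichment(signature, z_library):
    """Signed enrichment of the signature targets in one library drug's Z' profile."""
    ranked = sorted(z_library, key=z_library.get)  # list N, ascending Z'
    n = len(ranked)
    # v: position in list N of each signature target. The positions come out in
    # ascending order, so the index j in this list is the position in list T.
    positions = [v for v, target in enumerate(ranked, start=1) if target in signature]
    t = len(positions)
    if t == 0:
        return 0.0
    a = max(j / t - v / n for j, v in enumerate(positions, start=1))
    b = max(v / n - (j - 1) / t for j, v in enumerate(positions, start=1))
    return a if a > b else -b  # tie handling (a == b) is an assumption


def raw_score(ks_up, ks_down):
    """s = 0 when both enrichment scores share a sign, else ks_up - ks_down."""
    if ks_up * ks_down > 0:
        return 0.0
    return ks_up - ks_down


def association_scores(z_query, library):
    """library maps drug name -> {target: Z'-score}; returns drug -> S in [-1, 1]."""
    favorable, unfavorable = query_signature(z_query)
    raw = {drug: raw_score(enrichment(favorable, profile),
                           enrichment(unfavorable, profile))
           for drug, profile in library.items()}
    s_max, s_min = max(raw.values()), min(raw.values())
    scores = {}
    for drug, s in raw.items():
        if s > 0:
            scores[drug] = s / s_max
        elif s < 0:
            scores[drug] = -s / s_min
        else:
            scores[drug] = 0.0
    return scores


# Toy profiles over six hypothetical targets.
query = {"T1": -2.0, "T2": -1.5, "T3": 0.0, "T4": 0.2, "T5": 1.5, "T6": 2.0}
library = {
    "same": dict(query),
    "opposite": {t: -z for t, z in query.items()},
    "mixed": {"T1": -2.0, "T5": -1.8, "T6": -1.5, "T2": -1.2, "T3": 0.0, "T4": 1.0},
}
scores = association_scores(query, library)
# same -> 1.0, opposite -> -1.0, mixed -> 0.0
```

In the toy data, the library profile identical to the query receives the maximum score, the mirrored profile receives the minimum, and the profile in which favorable and unfavorable targets are both enriched at the same end is set to zero by the same-sign rule.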



INPUT AND OUTPUT 

Users need to upload a drug molecule in mol2 format with 
charges and hydrogens added. When the user submits a 
drug molecule, our server checks the format suitability 
and calculates the interaction profile of this drug 
towards all the targets in the database using DOCK6 
(32). The parameters of DOCK6 used in the back end are
listed in Supplementary Table S1, inherited from our
previous experience of constructing CPI (29-31). Users
can view the real-time progress online, and the page 
showing the current docking status of the uploaded drug 
will also be provided for bookmarking. It takes between 6 
and 20 h to finish a one-molecule task and an email will be 
sent on completion. The outputs comprise the two follow- 
ing major elements: 

(i) Library drugs which share similar (or opposite)
interaction profiles with the user's molecule, ranked
by similarity (or disparity) and listed with their known
indications and ADR information, suggesting
potential new indications and ADRs of the user's
molecule.

(ii) The candidate off-targets that tend to interact with 
the user's molecule. The server will visualize the 
drug-protein interactions, with amino acid residues 
within 6 Å of the molecule highlighted.
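
Before uploading, a user may want to confirm locally that a mol2 file carries charges and hydrogens. The following is a hypothetical pre-submission check and not the format check performed by the server. It assumes the standard Tripos atom record layout (id, name, x, y, z, atom type, substructure id, substructure name, charge), and the function name is illustrative.

```python
def check_mol2(text):
    """Return a list of problems found in a mol2 string (empty list: looks fine)."""
    problems = []
    if "@<TRIPOS>MOLECULE" not in text:
        problems.append("missing @<TRIPOS>MOLECULE record")

    atoms, in_atoms = [], False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("@<TRIPOS>"):
            in_atoms = stripped == "@<TRIPOS>ATOM"
            continue
        if in_atoms and stripped:
            atoms.append(stripped.split())
    if not atoms:
        problems.append("no atoms in @<TRIPOS>ATOM record")
        return problems

    # Atom columns: id, name, x, y, z, type, subst_id, subst_name, charge
    if any(len(fields) < 9 for fields in atoms):
        problems.append("partial-charge column missing for some atoms")
    else:
        try:
            if all(float(fields[8]) == 0.0 for fields in atoms):
                problems.append("all partial charges are zero")
        except ValueError:
            problems.append("non-numeric partial charge")

    # SYBYL hydrogen atom types are "H" or "H.<suffix>" (e.g. "H.spc").
    if not any(len(f) >= 6 and f[5].split(".")[0] == "H" for f in atoms):
        problems.append("no hydrogen atoms found")
    return problems


# Toy input: a water molecule with TIP3P-style charges.
example = """@<TRIPOS>MOLECULE
water
3 2 1
SMALL
USER_CHARGES

@<TRIPOS>ATOM
1 O1 0.000 0.000 0.000 O.3 1 HOH -0.834
2 H1 0.957 0.000 0.000 H 1 HOH 0.417
3 H2 -0.240 0.927 0.000 H 1 HOH 0.417
@<TRIPOS>BOND
1 1 2 1
2 1 3 1
"""
issues = check_mol2(example)  # empty list for this input
```

The check is deliberately shallow: it does not validate geometry, bond records or atom typing, all of which the docking step may still reject.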



RESULTS 

Prediction of the drug-drug associations 

Drug-drug associations based on gene-expression profiles 
can be used in drug repositioning (19). To test whether the 
CPI-based docking score profiles corresponded with drug- 
drug associations, we input the drugs used by Lamb et al. 
(17) in our server to generate query signatures and 
compared our result with their reports as the gold 
standard to test the associations among the drugs. Of 
the 87 associations in our server, we found that 64 asso- 
ciations (74%) matched with the correlations indicated in 
their article (Supplementary Table S2). 

The data sets we used are completely independent of the
gene-expression data: we inherited our method from the
connectivity analytics algorithm of the Connectivity Map
(cMap) without training it on any experimental data. In
evaluating our method, the 74% matching rate against the
true positives of the gene-expression profiles measures
the sensitivity; however, the specificity cannot be accurately
estimated, since it is difficult to define which pairs of drugs
are entirely unassociated with each other.

Case study 1: predicting drug-drug associations for
rosiglitazone

Rosiglitazone is an anti-diabetic drug of the
thiazolidinedione class. It binds to the peroxisome
proliferator-activated receptors (PPARs) in fat cells and
makes the cells more responsive to insulin
(33). To find the potential indications and ADRs for
rosiglitazone, we uploaded an active form of the drug 
and checked the results (Table 1). We found that 

(i) The drug sharing the closest similarity (association 
score of 1 and P-value 0.0270) to rosiglitazone is 
fulvestrant, a known anti-estrogenic drug (34), 
which is used in the treatment of hormone 
receptor positive metastatic breast cancer in 
post-menopausal women (35). As rosiglitazone's
binding target PPARγ is significantly related to
human primary and metastatic breast adenocarcinomas
(36), our server suggested a new indication for
rosiglitazone in the treatment of breast cancer.

(ii) The seventh nearest drug to rosiglitazone is
pravastatin (association score -0.909 and P-value
0.0590), which is used in the treatment of hypercholesterolemia
to reduce the risk of myocardial infarction.
Since rosiglitazone was associated with a
significant increase in the risk of myocardial infarction
(37), this opposite association suggested myocardial
infarction as a potential ADR of rosiglitazone.

(iii) After clicking on the 'CPI' button, we see that using
Z'-scores our server ranked PPARγ at the top.

This prediction provides clues for further studies on
rosiglitazone's new therapeutic applications to breast
cancer as well as a warning of rosiglitazone's ADR in
relation to myocardial infarction. If these data are confirmed
by specific experiments, the manufacturer would
have a more comprehensive guide as to which indication
is appropriate and whether to redesign or modify the drug
to weaken unexpected bindings on off-targets. With this
information, the drug development process can be made
quicker and less costly and unexpected lawsuits can be
avoided.

Table 1. Associations of library drugs towards rosiglitazone

| Rank | Library drug | Indication | ADR | Association score | P-value |
|---|---|---|---|---|---|
| 1 | Fulvestrant | For the treatment of hormone receptor positive metastatic breast cancer in post-menopausal women with disease progression following antiestrogen therapy. | N/A | 1 | 0.0270 |
| 2 | Geldanamycin | N/A | N/A | -1 | 0.0742 |
| 3 | Rosiglitazone | For the treatment of Type II diabetes mellitus | LongQT | 0.977 | 0.0000 |
| 4 | Risperidone 4 | For the treatment of schizophrenia in adults and in adolescents, ages 13 to 17, and for the short-term treatment of manic or mixed episodes of bipolar I disorder in children and adolescents ages 10 to 17. | Rhabdomyolysis | -0.939 | 0.1215 |
| 5 | 17-allylamino-17-demethoxygeldanamycin | N/A | N/A | -0.934 | 0.1066 |
| 6 | Galantamine 2 | For the treatment of mild to moderate dementia of the Alzheimer's type. | N/A | -0.931 | 0.0122 |
| 7 | Pravastatin 2 | For the treatment of hypercholesterolemia to reduce the risk of myocardial infarction. | Rhabdomyolysis | -0.909 | 0.0590 |

Seven drugs are ranked by association scores at the top of the list.

Network analysis of drug associations 

Based on the fact that the drug-drug associations pre- 
sented by docking results matched the correlations 
indicated by gene-expression profiles in as many as 74% 
of cases, we selected the pair-wise associations among all 
254 molecules in our library and applied thresholds for 
association scores of >0.6 and P-value <0.05. We then 
visualized the remaining associations in a network 
layout using a force-directed method based on association 
scores (Figure 1). 
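
The thresholding step that precedes the network layout can be sketched as below. This is a minimal illustration with placeholder drug names and made-up scores, not data from the server. It assumes the threshold applies to the signed association score exactly as stated (> 0.6), and it covers only the edge selection, not the force-directed layout itself. The function names are hypothetical.

```python
def filter_associations(pairs, min_score=0.6, max_p=0.05):
    """Keep (drug_a, drug_b, score, p_value) tuples passing both thresholds."""
    return [pair for pair in pairs if pair[2] > min_score and pair[3] < max_p]


def to_adjacency(edges):
    """Build an undirected adjacency map: drug -> {neighbour: score}."""
    graph = {}
    for a, b, score, _p in edges:
        graph.setdefault(a, {})[b] = score
        graph.setdefault(b, {})[a] = score
    return graph


# Placeholder pairs (invented values, for illustration only).
pairs = [
    ("drug_A", "drug_B", 0.82, 0.010),  # kept
    ("drug_A", "drug_C", 0.55, 0.001),  # score too low
    ("drug_B", "drug_C", 0.91, 0.200),  # P-value too high
    ("drug_C", "drug_D", 0.73, 0.030),  # kept
]
edges = filter_associations(pairs)
graph = to_adjacency(edges)
```

The resulting adjacency map, with scores as edge weights, is the kind of input a force-directed layout tool would take.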

Case study 2: the potential for repositioning anti- 
psychotics as anti-infectives 

Five phenothiazine (chlorpromazine, fluphenazine, pro- 
chlorperazine, thioridazine and trifluoperazine) and two 
non-phenothiazine (haloperidol and clozapine) 
anti-psychotics showed positive connections in terms of 
gene expression profile (17). From our network, we 
found all seven drugs tightly clustered (Figure 1, shown 
in red). In addition, based on the edges at which they 
connected to other nearby drugs, we also found four 
other anti-psychotics (chlorprothixene, droperidol, 
olanzapine and risperidone) holding the same anatomical 
therapeutic chemical (ATC) code N05A (38-44) (Figure 1,
shown in red). Among the 11 typical anti-psychotics
(structures are shown in Supplementary Figure S1),
6 anti-psychotics, including chlorprothixene, clozapine, 
droperidol, haloperidol, olanzapine and risperidone are 
non-phenothiazines, which have distinct structures. 
Furthermore, propericiazine at the left of this cluster is 
used as adjunctive medication in some psychotic patients 
(http://www.drugbank.ca/drugs/DB01608), which is also 
highly related to anti-psychotic treatment. By using the
drug-drug associations predicted by our server, we successfully
recalled 11 anti-psychotics and one adjunctive medication,
indicating that new drug molecules falling within this
cluster might have an effect in the treatment of psychotic
disorders.

The cluster of the anti-infectives (ATC code S01A,
Figure 1, shown in blue) is close to the anti-psychotics,
and six of the seven anti-infectives are
aminoglycosides (gentamicin, streptomycin, netilmicin,
amikacin, kanamycin and tobramycin, ATC code J01G).
The anti-psychotic agent prochlorperazine is reported to
show powerful antimicrobial activity against 157
strains of bacteria in vitro (45), and prochlorperazine
and chlorpromazine can typically reduce by > 1000-fold
the minimum inhibitory concentration (MIC) of
aminoglycosides in their synergistic interactions against
Burkholderia pseudomallei, the causative agent of melioidosis
(46), suggesting a potential novel therapeutic
treatment for drug resistance in bacterial infections.
Using the drug-drug associations predicted by our
server, we found a novel application of anti-psychotics
for anti-infective treatment, and vice versa, implying potential
connections between the two classes of drugs in both
applications and mechanisms.



DISCUSSION 

Both chemical-protein interactions and gene expression
changes reflect how drugs/chemicals perturb biosystems.
Gene expression change is a downstream event,
whereas the chemical-protein interactions are the
primary step when drugs enter biosystems. In this study,
without mining the microarray data, we demonstrated the
power of CPI to represent the perturbation towards the
biosystems and how it can be used to measure drug
effects in terms of indications and adverse effects.
Moreover, in silico discovery of associations among
the interaction profiles of small molecules is efficient
and cheap, and its predictions show a high rate of
agreement with those based on gene-expression profiles.
Furthermore, the putative targets could also be prioritized
for unexpected interactions, and sent for further wet-lab
validation.

Knowledge of the existing is important to find clues for 
the new (48). Drug repositioning combined with the pre- 
vention of ADR is a major preoccupation for the manu- 
facturers. Discovering drug-drug associations based on 
CPI is a novel method which can be simultaneously 
applied in the prediction of both repositioning and 
ADRs. With the in silico predictions of potential associ- 
ations among drugs, researchers may not only find helpful 
clues for exploring potential mechanisms, but could also 
save significant time and cost in safely repositioning 
existing drugs for new indications, or predicting potential 
ADRs. Uncovering associations among molecules as a 
means of understanding intricate biological systems is 
consistent with the current trend of '-omics' analyses. 

This server is intended as a methodology complementary
to the analysis of gene-expression profiles in drug
repositioning. By applying this method to CPI in combination
with our previous algorithm 2DIZ, we evaluated the drug-drug
associations based on their interaction profiles to
indicate potential therapies and ADRs. The advantage
of our method is the low cost of harvesting
high-dimensional data in silico instead of in vitro while
the outcome can still be predictive. We found that the
predictions of our server indicated some information that
is not revealed in cMap, as the following example shows.

Estradiol is a sex hormone which can be used in
hormone replacement therapy, while minocycline is a
bacteriostatic antibiotic for the treatment of infections by
microorganisms. From a query signature generated by
estradiol, minocycline showed no connectivity (connectivity
score = 0) in Result S2 of cMap (17). However, after
querying our server with estradiol, both forms of
minocycline achieved positive association scores (0.310
and 0.373, respectively). Since the two drugs showed
antioxidative abilities of lipid peroxidation inhibition
and DPPH radical scavenging (49), contributed to
hormone-modulated anabolic responses in fibroblasts
after adjunctive periodontal treatment (50) and could be
used in the prevention of ovariectomy-induced reduction
in bone mineral density (51), our server successfully
predicted the positive association between them for potential
indications.

Figure 1. Drug association network. The drugs are clustered using Cytoscape (47) and employing a force-directed method based on association
scores. Some nodes are coloured according to ATC codes. Five phenothiazine anti-psychotics (chlorpromazine, fluphenazine, prochlorperazine,
thioridazine and trifluoperazine) and six non-phenothiazine anti-psychotics (chlorprothixene, clozapine, droperidol, haloperidol, olanzapine and
risperidone) are retrieved by our server (shown in red circles, ATC code N05A). Seven anti-infectives are nearby (shown in blue circles, ATC
code S01A), and six of them are aminoglycosides (gentamicin, streptomycin, netilmicin, amikacin, kanamycin and tobramycin, ATC code J01G).
Background nodes and edges are hidden in the bottom image. The associations revealed potential novel applications for the anti-psychotics and
anti-infectives.

This server is independent of, and different from, our
previous project, SePreSA (31), in terms of both
method and purpose: the current project is concerned
with drug repositioning by searching for similar
(or opposite) drugs using associations, while the earlier
work focused on populations susceptible to Serious
Adverse Drug Reaction (SADR) by searching for potential
patient-specific targets using polymorphisms within
the binding pockets.



CONCLUSIONS 

(a) The main function of the DRAR-CPI server is to 
evaluate associations between the user's uploaded 
drug and the library drugs based on their docking 
profiles towards the putative targets so as to provide 
suggestions on potential indications and ADRs of 
the user's drug. The accuracy of the drug-drug 
associations is evaluated by recalling drug-drug 
associations based on gene expression profiles. 

(b) Researchers can not only identify the putative targets 
towards their drugs of interest, but also view a sug- 
gested prioritization of potential indications and 
ADRs with a given confidence value. An extensive 
range of decisions can be made from the pool of 
information thus improving the efficiency of R&D. 

(c) Unexpected associations can be revealed thereby 
advancing the understanding of the underlying mech- 
anisms of different kinds of interactions and poten- 
tially indicating novel treatments. 



SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online. 